Judging Event Covariations

Abstract 

Past research indicates poor agreement about strategies people use to
assess covariation between events. This research investigates method
of assessment as one possible source of this low consensus. A set of
problems was developed in such a way that different judgment rules would
produce different decisions about the relationships between events.
College subjects judged these problems, then were asked to explain their
judgment strategy. In addition, they were shown model strategies and
asked to choose the one like their own strategy and the model that would
be the best strategy. Subjects whose judgments indicated use of the most
sophisticated strategy were quite accurate in reporting their judgment
rules. Subjects using the less accurate rules most commonly reported
using strategies which could not have produced the obtained pattern of
problem solutions. These findings suggest that self-report is a weak
basis for conclusions about sources of error in covariation judgment.



Children's Judgments about Covariation between Events: 
A Series of Training Studies, Appendix A.



Judging Event Covariations 

Statistical concepts represent one prime area for application of
mathematical training. In particular, statistics are necessary for
identifying predictability in an environment where relationships are
frequently probabilistic (y is more likely when x is present) rather
than deterministic (y always occurs when x is present). Problems such
as these are common in identifying regularities in scientific phenomena,
and in everyday contexts as well. In this respect, statistics provide
a key link between basic mathematical concepts and central aspects of
scientific and everyday problem solving. As an area for application of
mathematical training, research on statistical reasoning may also be
informative about children's and adults' abilities to apply their mathematical
skills appropriately.

The focus of existing research in this area has been on
probability judgments (e.g., Piaget & Inhelder, 1975; Fischbein, 1975;
Yost, Siegel, & Andrews, 1962). A statistical judgment common to reasoning
about cause-effect relationships builds on probability assessments of this sort.
An individual investigating the relationship between potential cause x
and effect y would compare the likelihood of y occurring when x is present,
P(y/x), with the likelihood that y occurs without x, P(y/x̄). The two events
are independent if these conditional probabilities are equal; nonindependence
is indicated by any difference. The comparison is made to identify
contingency or covariation between events. Scientific procedure and
statistical analyses testify to the key role of covariation analysis in
professional practice. Although not sufficient for causal inference,
covariation is a necessary condition between causes and events. Thus,
covariation analysis may identify the set of possible causes of an event.
Many psychologists further assert that everyday causal judgment is similarly
based on a covariation analysis (e.g., Michotte, 1963; Inhelder & Piaget,
1958; Kelley, 1967; Heider, 1958). That is, people search for likely
explanations of everyday events by identifying event covariates. Thus,
competence in covariation judgment may determine a person's adequacy in
identifying real-world cause-effect relationships.
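
The comparison of conditional probabilities described above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original study: the function names and the example counts are hypothetical, and each condition (x present, x absent) is assumed to have been observed at least once.

```python
from fractions import Fraction

def conditional_probabilities(y_with_x, not_y_with_x, y_without_x, not_y_without_x):
    """Return (P(y/x), P(y/not-x)) computed from four co-occurrence counts.

    Assumes x was present at least once and absent at least once;
    otherwise the corresponding probability is undefined.
    """
    p_y_given_x = Fraction(y_with_x, y_with_x + not_y_with_x)
    p_y_given_not_x = Fraction(y_without_x, y_without_x + not_y_without_x)
    return p_y_given_x, p_y_given_not_x

def events_covary(y_with_x, not_y_with_x, y_without_x, not_y_without_x):
    """True when the two conditional probabilities differ (nonindependence)."""
    p_with, p_without = conditional_probabilities(
        y_with_x, not_y_with_x, y_without_x, not_y_without_x
    )
    return p_with != p_without

# Hypothetical counts: y occurs on 9 of 12 occasions with x, 4 of 12 without x.
print(events_covary(9, 3, 4, 8))  # True: 3/4 differs from 1/3
# Hypothetical counts: y occurs on 6 of 8 occasions with x, 3 of 4 without x.
print(events_covary(6, 2, 3, 1))  # False: both probabilities equal 3/4
```

Exact fractions are used so that equal probabilities compare as equal without floating-point rounding.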

In fact, a variety of investigators have found that adolescent and
adult subjects show little competence in identifying event covariations
(Neimark, 1975; Smedslund, 1963; Jenkins & Ward, 1965; Adi, Karplus, Lawson,
& Pulos, 1978). While the evidence indicates that covariation judgments
are often erroneous, those judgments may be rule-governed nonetheless.
Several different rules have been proposed by past investigators as
possible judgment strategies. These rules are discussed in terms of possible
relationships between two events (A and B), each of which occurs in one
of two states (1 and 2).

Least sophisticated of the proposed strategies is judgment according
to the frequency with which the target events co-occur (A1B1, cell a in a
traditionally labeled contingency table), failing to consider the other
event-state pairings (A1B2, A2B1, A2B2) defining the relationship. A
subject using this strategy would identify a positive relationship between
A1 and B1 if cell a frequency were the largest of the contingency table cells,
a negative relationship if it were the smallest (cell a strategy). This
strategy is identified by Inhelder and Piaget (1958) as common among younger
adolescents. Smedslund (1963) and Nisbett and Ross (1980) suggest that the
strategy is typical among adults as well. The strategy does consider some
relevant information and may result in better-than-chance performance.
However, the rule considers only a limited portion of the information
that defines the relationship and would result in erroneous judgment of
many relationships.

A second possible approach would compare the number of times target
events A1 and B1 co-occur with the times A1 occurs with B2 (comparison of
frequencies in contingency table cells a and b; strategy a versus b). This
strategy is also identified by Inhelder and Piaget (1958) as a precursor of
mature judgment. Again, this strategy considers some of the relevant information
and may result in accurate judgment of many event contingencies. However,
failure to consider frequencies in cells c and d (event combinations A2B1
and A2B2) would be a particularly costly error when the direction of that
frequency difference is the same as the difference between cells a and b.

A much improved approach would be the strategy defined by Inhelder
and Piaget (1958) as characteristic of formal operational thinking.
Specifically, covariation would be defined by comparing frequencies of
events confirming (cells a and d) and disconfirming (cells b and c) the
relationship. Thus, the rule would compare the sums of the diagonal cells
in a contingency table (sum of diagonals strategy). Jenkins and Ward
(1965), however, point out that this strategy has its limits as well.
Specifically, the rule is an effective index only when the two states of
at least one of the variables occur equally often. Otherwise, a correlation
may be indicated when, in fact, independence is the case.
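
The limit noted by Jenkins and Ward can be checked numerically. The sketch below uses a hypothetical table (not one of the study's problems) in which both variables have unequal marginal frequencies; cells are labeled a = A1B1, b = A1B2, c = A2B1, d = A2B2, and the function names are illustrative.

```python
from fractions import Fraction

def sum_of_diagonals(a, b, c, d):
    """Confirming cases (a + d) minus disconfirming cases (b + c)."""
    return (a + d) - (b + c)

def conditional_difference(a, b, c, d):
    """P(A1/B1) - P(A1/B2), computed as a/(a+c) - b/(b+d)."""
    return Fraction(a, a + c) - Fraction(b, b + d)

# Hypothetical table: A1 occurs 18 times and A2 6 times;
# B1 occurs 16 times and B2 8 times, so neither variable is balanced.
a, b, c, d = 12, 6, 4, 2

print(sum_of_diagonals(a, b, c, d))        # 4: the rule signals a positive relationship
print(conditional_difference(a, b, c, d))  # 0: A1 is equally likely given B1 or B2
```

Here the diagonal sums differ (14 versus 10) even though A1 occurs on three quarters of the occasions under both B1 and B2, which is the kind of false correlation the text describes.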

Instead, Jenkins and Ward (1965) suggest that covariation is more
appropriately evaluated by comparing the probability of event A1 given
event B1, P(A1/B1), with the probability of A1 given that B2 has occurred,
P(A1/B2). This is equivalent to a comparison of the frequency ratio in
contingency table cells a/(a+c) with that in cells b/(b+d). By definition,
independence is indicated by equivalence between these conditional
probabilities; nonindependence is indicated by any difference (conditional
probability strategy). This strategy should result in accurate judgment
of any contingency problem.

Thus, four alternative strategies have been proposed to account for
subjects' judgment patterns. Many of these rules were proposed on the
basis of subjects' explanations of their judgments. For example, Smedslund's
(1963) cell a strategy is based on the reports of over half of his sample
that they judged the relation of symptom A and diagnosis F according to the
number of AF pairings. Adi, Karplus, Lawson, and Pulos (1978) similarly
categorized subjects according to their explanations. In this case, however,
no subjects were classified as using a cell a strategy. Rather, subjects
described themselves as using various combinations of two to four of the
contingency table cells. Thus, two samples of subjects offer considerably
different explanations of their judgment strategies. Two features of these
studies make it hard to reconcile these differences. First, the two reports offer
little information on the way the explanations were elicited. We might expect that
different questions would result in different responses. Secondly, neither
of the investigators reports the level of agreement with which subject
responses were categorized, so we know little about the reliability of the
categorization schemes.

However, a more serious problem is relevant to any explanation-based
strategy analysis. That is, such an approach is predicated on the assumption
that subjects are able and willing to accurately describe their bases of
judgment. In fact, a variety of research in psychology suggests that this
assumption may not be justified. In developmental research in particular,
young children's poor verbal skills may hinder their account of systematic
judgment bases. Thus, verbal accounts frequently underestimate judgment
competence in research with children (e.g., Brainerd, 1973; Bullock, Gelman,
& Baillargeon, in press; Goldberg, 1966). Research with adults, on the
other hand, indicates that subjects' explanations often overestimate judgment
sophistication. Both expert and nonexpert judges (e.g., Goldberg, 1968;
Nisbett & Wilson, 1977) describe themselves as using complex rules that bear
little resemblance to the simpler patterns of their actual performance.
Ericsson and Simon (1980) note that relative accuracy of verbal reports may
depend on the conditions under which the information is gathered. These
findings would suggest that explanation-based analyses of judgment strategies
should be treated with caution.

An alternative approach would be to analyse judgment strategies on
the basis of subjects' actual performance patterns (Ward & Jenkins, 1965;
Jenkins & Ward, 1965; Shaklee & Tucker, 1980). That is, four different
rules have been proposed to account for subjects' judgments of event
covariations. Since different rules produce different judgments,
covariation problems could be identified which would differentiate between
those rules. In fact, careful structuring of a problem set should allow
us to identify the specific strategy a subject is using.

A set of such problems is illustrated in Table 1a. Problems are
structured hierarchically such that cell a problems are correctly solved
by all strategies; a versus b problems are correctly solved by
a versus b, sum of diagonals, and conditional probability strategies. Sum
of diagonals problems will be accurately judged by sum of diagonals and
conditional probability strategies. Conditional probability problems
would be correctly solved by the conditional probability strategy alone.
Solution accuracy is indexed by the direction of the judged relationship
(i.e., A1 more likely given B1, B2, or no difference). A subject's
solution pattern on the set of problems indicates the strategy used.
Problems on the first row of Table 1a illustrate judgments predicted by
each of the proposed rules. All problems in the row indicate relationships
in which A1 is more likely given B1 than given B2. However, an individual
using the cell a strategy would judge only the first problem as such a
relationship (cell a is the largest of the cells). A person using the a
versus b strategy would accurately judge the first two problems in the row,
but would say that A1 given B1 is as likely as A1 given B2 in the third
problem (2-2), and that A1 was less likely given B1 than B2 in the last
problem (2-12). The sum of diagonals rule would result in the correct
judgment of the first three problems, but would say that A1 was as likely
to occur with B1 as with B2 on the last problem, (2+10) - (12+0). A subject
using the conditional probability rule should accurately judge all of the
first row problems. Table 1b identifies the solution pattern congruent
with each strategy type. The probability of matching these judgment patterns
by chance alone is .11 for cell a, .04 for a versus b, .01 for sum of
diagonals, and .005 for the conditional probability pattern.
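
The four rules and their predictions for the last problem in the row can be sketched as follows. This is an illustrative reading of the rule descriptions, not the authors' scoring procedure. The function names are hypothetical; judgments are coded +1 (A1 more likely given B1), 0 (no difference), and -1 (A1 less likely given B1); the cell values a = 2, b = 12, c = 0, d = 10 are read off the differences quoted in the text; and the cell a rule is assumed to report no difference when cell a is neither the largest nor the smallest cell, a case the text does not spell out.

```python
from fractions import Fraction

def sign(x):
    """Return +1, 0, or -1 according to the sign of x."""
    return (x > 0) - (x < 0)

def cell_a(a, b, c, d):
    # Positive if cell a is the largest cell, negative if it is the smallest.
    # Assumption: any other case is judged as "no difference".
    if a > max(b, c, d):
        return 1
    if a < min(b, c, d):
        return -1
    return 0

def a_versus_b(a, b, c, d):
    # Compare A1B1 pairings with A1B2 pairings only.
    return sign(a - b)

def sum_of_diagonals(a, b, c, d):
    # Compare confirming cases (a + d) with disconfirming cases (b + c).
    return sign((a + d) - (b + c))

def conditional_probability(a, b, c, d):
    # Compare P(A1/B1) = a/(a+c) with P(A1/B2) = b/(b+d).
    return sign(Fraction(a, a + c) - Fraction(b, b + d))

RULES = {
    "cell a": cell_a,
    "a versus b": a_versus_b,
    "sum of diagonals": sum_of_diagonals,
    "conditional probability": conditional_probability,
}

last_problem = (2, 12, 0, 10)  # cells a, b, c, d
predictions = {name: rule(*last_problem) for name, rule in RULES.items()}
print(predictions)
# {'cell a': 0, 'a versus b': -1, 'sum of diagonals': 0, 'conditional probability': 1}
```

Only the conditional probability rule recovers the correct direction here, since A1 occurs on 2 of 2 occasions with B1 but on only 12 of 22 occasions with B2.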

In two experiments, Shaklee and Tucker (1980) employed this diagnostic
approach to identify judgment rules of 10th grade and college students.
Subjects judged relationships in three problems for each proposed strategy
type. Each problem consisted of 24 instances in which event states were
defined for two events. Problems were set in contexts of everyday events
(e.g., cake rises or falls with or without "special ingredient," plants
healthy or not healthy which do or do not receive plant food). Subjects'
performance indicated general conformity to the strategy set. Congruence
with the cell a strategy pattern was frequent among the high school subjects
(17%) but rare in the college sample (1%). Response patterns matched that
of the a versus b strategy for 18% of the college sample (use of this
strategy was not tested among the high school subjects). Judgment patterns
were congruent with the conditional probability strategy for 17% of the high
school subjects and 33% of the college sample. In each experiment, the modal
response pattern conformed to that of the sum of diagonals rule (35% of the
college subjects, 41% of the high school subjects). Subsequent studies
demonstrated that children use increasingly sophisticated rules with
increasing age in the 4th grade to college age span (Shaklee & Mims, 1981),
and that adults tend to use simpler rules as the decision environment
becomes more complex (Shaklee & Mims, 1982).

In sum, the data from several studies indicate that a carefully
structured problem set can be profitably used to indicate strategies underlying
judgments of covariations between events. Subjects in these experiments
demonstrated at least some sophistication about appropriate covariation
judgment; however, the optimal judgment rule was used by a minority of subjects.
Such judgments are particularly interesting since they build so directly on
the basic mathematical understanding of ratios and fractions. That is, people
making covariation judgments should be comparing two conditional probabilities,
each of which is a ratio between two frequencies. Our evidence indicates
that substantial use of such a strategy does not occur until the 10th grade,
and then by only a minority of subjects. This evidence is congruent with
other research indicating that problems in application of ratio concepts are
common among adults as well as children (Karplus & Peterson, 1970; Kurtz &
Karplus, 1979; Capon & Kuhn, 1979).

In addition, these findings conflict with the past interview-based
strategy analyses. In particular, Smedslund's (1963) only commonly
reported strategy, cell a, is rarely seen in the performance patterns of
our subjects. In light of this conflict, a direct comparison of explanation
and judgment-based strategy analyses would be profitable. By this approach,
subjects would be asked to complete a diagnostic problem set, then explain
their judgment bases. Comparison of classification by the two methods might
show areas of systematic disagreement. In addition, interview responses
offer new information in evaluating our judgment-based analysis. That is,
subjects may describe themselves as using rules which may differ from any
of our proposed rules, but which would produce a judgment pattern on the
problem set congruent with that of one of our rules. Finally, we learn
something about subjects' insight into their own reasoning. Such understanding
of subjects' own impressions about their task solutions would be
particularly important in any attempts to improve judgment competence.
That is, training may be maximally effective when it is oriented toward
the individual's own understanding of his or her rule use.

A second interest in this study is in subjects' evaluations of the
adequacy of the rules they use. Those using less sophisticated rules may
or may not be aware of rule limits. This study will measure judgments of
rule adequacy by asking subjects to give confidence ratings as they make
their judgments in the problem set. Subjects who are less confident of
erroneous responses than of correct responses must be aware of their rule
limitations. In addition, subjects will be asked to identify the best
rule among our set of proposed strategies.

Subjects for this experiment will be male and female college students,
since our past research suggests that this age group should provide substantial
numbers of a versus b, sum of diagonals, and conditional probability
judges. Sex of subject will be considered as a factor in the design in
light of common findings of sex differences in math skills among adolescents
and adults (e.g., Maccoby & Jacklin, 1974).

Method 

Subjects 

Subjects in the experiment were students in an introductory psychology
class who participated in the experiment as one option in fulfillment of a
course requirement. Subjects ranged in age from 18 to 32 years, with a mean age
of 19.42. Sixty-two female and male students participated.

Problems

Subjects judged a set of 12 covariation problems, structured so that each
of four judgment rules would produce a distinctive judgment pattern on a problem
set. Table 1a lists the actual problems used. The 12 problems include three
problems for each of the four strategy types. One noncontingent and two
contingent relationships are included for each strategy problem type.
Twelve different problem contents were developed, each of which
consisted of a set of observations picturing one of two states for two
potentially related everyday events. Three problems pictured bakery products
which either rose or fell in association with the presence or absence of
yeast, baking powder, or a "special ingredient." In three other problems,
plants were pictured as healthy or sick as a possible function of the presence
or absence of plant food, bug spray, or a "special medicine." In three
problems people or animals were pictured as sick or healthy as a possible
function of the presence or absence of a shot, liquid medicine, or a pill.
The remaining three problems pictured a possible association between space
creatures appearing happy or sad in the presence or absence of one of three
weather conditions (snow, fog, or rain).

For each problem, data instances are pictured in a 2 x 2 table. In
each case, the manipulated factor (or environmental event) defined the table
columns (e.g., plant food, no plant food in example below), and the outcomes
defined the table rows (plants healthy, not healthy in the example below).
Each problem is introduced with a paragraph describing a context in which
several observations were made on two potentially related variables.
Subjects were asked to look at the pictured information and to identify
the relative likelihood of one of the events when the second event was
either present or absent. An example problem follows:

A plant grower had a bunch of sick plants. He gave
some of them special plant food, but some plants didn't
get special food. Some of the plants got better but some
of them didn't. In the picture you will see how many times
these things happened together. The picture indicates that
plants which were given special food were:

+3        +2        +1        0         -1        -2        -3
much      somewhat  a bit     just      a bit     somewhat  much
more      more      more      as        less      less      less
likely    likely    likely    likely    likely    likely    likely

to get better than plants that weren't given special food.
On your answer sheet write the scale number that best
completes the sentence.

In addition, after each covariation judgment subjects were asked to rate
their confidence as follows:

How certain are you in the accuracy of the above response?
1     2     3     4     5     6     7     8     9     10
just guessing                           absolutely certain

The 12 problems were grouped into problem blocks, each including one
problem from each strategy type. Problems within each block were arranged
in a single random sequence. The three problem blocks were sequenced in
a single random order. Numbers in parentheses to the left of the problems
in Table 1a indicate the position of each problem in the problem sequence.

Once the problem set was completed each subject was interviewed and
asked the following questions about his or her judgment:

1a. You've just completed several problems about the relationship
    between events. Can you tell how you solved them?
1b. (Experimenter turns to the last problem in the set - a conditional
    probability problem.) Can you use this problem to show me how
    you solved it? (strategy explanation)

2. (The participant is shown models of the strategy types while
   they are described.) Can you indicate, from the models presented,
   the strategy you used to solve the problems? (model choice)

3. Overall, which do you feel is the "best" strategy? (best strategy)

Each subject was tested and interviewed individually.

Instructions 

Initial instructions introduced the subject to the concept of covariation
in the context of "things that go together". Naturally occurring examples
were given of positive relationships (i.e., tall people are more likely to
be heavy than short people), negative relationships (i.e., it is less likely
to rain when it is sunny than when it is cloudy), and unrelated events (i.e.,
a green truck is just as likely to run out of gas as a red truck). Subjects
were told that they would be given some problems about hypothetical events
that may or may not tend to occur together. A sample problem involving the
occurrence of snow as it did or did not relate to atmospheric temperature
was used to explain the stimulus materials and the problem format. Each
subject gave a solution to the sample problem and was invited to ask
questions about the task. Subjects were allowed to progress through the
problems at their own pace and were encouraged to use the scratch paper
provided if they desired.

Results

Results can be grouped according to their relevance to two issues.
First, subjects' performances can be characterized in terms of the accuracy
of problem solutions. Confidence ratings on these problems indicate subject
beliefs about their accuracy. Secondly, judgment strategies are identified
according to subjects' solution patterns on the problem set and their
responses to the interview questions.

Accuracy. Accuracy was assessed in terms of the direction of the
judged relationship (i.e., A1/B1 more, less, or equally likely than A1/B2).
Data are analysed in terms of the number of problems correct per problem
type. Relevant means for this analysis are reported in Table 2. A sex by
problem type analysis of variance shows a main effect of problem type
(F(3,342) = 164.36, p < .001) with mean accuracy of 2.88 for cell a, 2.65
for a versus b, 1.47 for sum of diagonals, and 1.21 for conditional
probability problems. A main effect of sex of subject was also significant
(F(1,114) = 6.67, p < .01), with more problems correctly solved by males
than by females. The sex by problem type interaction was also significant
(F(3,342) = 3.08, p < .03), with the greatest sex differences in accuracy
for the sum of diagonals and conditional probability problems (see Table 2).

A sex by problem type analysis of variance of confidence ratings showed
that subjects had some insight into solution accuracy. This was reflected
in a significant effect of problem type on confidence ratings, with
confidence decreasing as problem difficulty increased (F(3,342) = 25.,
p < .001). Mean confidence ratings were 8.5 for cell a, 8.4 for a versus b,
7.8 for sum of diagonals, and 7.7 for conditional probability problems.
Confidence judgments did not differ by sex either as a main effect or in
interaction with problem type.

Strategy. Each subject's pattern of solution accuracy on problems of
the four types was used to identify his or her judgment strategy. Performance
patterns congruent with the four strategies are illustrated in Table 1b.
A subject was said to have passed criterion on a given problem type if he
or she was accurate on two or more of the three problems of that type. A
conditional probability subject should pass criterion on all problem types;
sum of diagonals judges should pass criterion on all problem types except
the conditional probability problems. Judges using the a versus b rule
should pass criterion on cell a and a versus b problems. Cell a subjects
should pass cell a problems alone. Someone who passes no criteria would
be labeled Strategy 0. Judgment patterns that do not match any of these
predicted patterns are classified as "other." Classification by this method
will be referred to as the judgment-based strategy.
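
The classification rule just described amounts to a lookup on the pattern of criteria passed. The following sketch is one hypothetical way to express it; the function and label names are illustrative, and the input is assumed to be the number of problems answered correctly (0 to 3) for each of the four problem types, ordered from cell a to conditional probability.

```python
# Pass/fail patterns over (cell a, a versus b, sum of diagonals,
# conditional probability) problem types, and the strategy each implies.
PATTERNS = {
    (False, False, False, False): "strategy 0",
    (True, False, False, False): "cell a",
    (True, True, False, False): "a versus b",
    (True, True, True, False): "sum of diagonals",
    (True, True, True, True): "conditional probability",
}

def classify(correct_counts, criterion=2):
    """Map per-type correct counts to a judgment-based strategy label.

    A problem type is passed when at least `criterion` of its three
    problems were judged correctly; unmatched patterns are "other".
    """
    passed = tuple(count >= criterion for count in correct_counts)
    return PATTERNS.get(passed, "other")

print(classify((3, 3, 1, 0)))  # a versus b
print(classify((3, 2, 2, 3)))  # conditional probability
print(classify((3, 0, 2, 0)))  # other
```

Because the problem types are hierarchical, only five of the sixteen possible pass/fail patterns correspond to a named strategy; everything else falls into the "other" category.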

Distribution of these judgment-based classifications is illustrated for
each of the two sexes in Table 3. These results indicate that all subjects
passed at least one criterion, indicating that they understood the stimuli
and had at least a simple understanding of the judgment to be made. Most
frequently occurring were judgment patterns congruent with a versus b and
conditional probability rules (36.2% and 31.9% of the sample, respectively).
Cell a and sum of diagonals classifications were less common (5.2% and
15.5%, respectively). Judgments of 13 subjects failed to match any of our
proposed patterns and were classified as "other". Table 3 also shows males
as generally using more sophisticated strategies than those used by females.
The distributions of the two sexes were compared by assigning each subject
a number corresponding to the number of problem type criteria passed (cell
a = 1, conditional probability = 4). A t test comparing males and females
on strategy classification shows the sex difference in strategy use to be
reliable (t(101) = 2.68, p < .01).

A final judgment-based strategy analysis compares the confidence ratings
of subjects in each of the strategy classifications. A subject strategy
by problem type analysis of variance showed no significant difference as
a function of subject judgment strategy (F(3,99) = 1.54, ns). However,
subject strategy did interact with problem type (F(9,297) = 2.68, p < .01).
In this interaction, subjects classified as a versus b, sum of diagonals,
and conditional probability judges showed parallel decreasing confidence
as problem difficulty increased. However, cell a judges were least confident
on a versus b problems. As in the previous analysis, confidence ratings
also showed a main effect of problem type (F(3,297) = 28.68, p < .001).

Independent categorizations of subjects' strategies were based on
their responses to the interview questions. First, subjects were asked
to state their strategies (question 1a) and to demonstrate that strategy
on a sample problem (question 1b). These two responses were considered
together and coded according to whether they conformed to one of our four
strategies. Two alternative responses were also common. Several subjects
described themselves as using a variant of the conditional probability
strategy which compared ratios of cell frequencies a/c with cell frequencies
b/d. This strategy would produce the same judgments as our conditional
probability strategy and will be labelled cell ratios. A second common
response was for a subject to say that he or she had just guessed. Responses
that did not match any of these categories were labelled "other". All
responses were independently categorized by two coders. These two raters
agreed on 89% of their ratings. Table 4 illustrates these classifications
of subjects' explanations.
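
The claim that the cell ratio variant yields the same judgments as the conditional probability rule follows from cross-multiplication: the comparison of a/c with b/d and the comparison of a/(a+c) with b/(b+d) both reduce to comparing ad with bc. The sketch below checks this by brute force over a grid of small tables. It is an illustration only: the function names are hypothetical, and cells c and d are restricted to nonzero values so that the ratios a/c and b/d are defined.

```python
from fractions import Fraction
from itertools import product

def sign(x):
    """Return +1, 0, or -1 according to the sign of x."""
    return (x > 0) - (x < 0)

def conditional_probability(a, b, c, d):
    # Direction of P(A1/B1) - P(A1/B2).
    return sign(Fraction(a, a + c) - Fraction(b, b + d))

def cell_ratios(a, b, c, d):
    # Direction of a/c - b/d (requires c > 0 and d > 0).
    return sign(Fraction(a, c) - Fraction(b, d))

# All tables with a, b in 0..6 and c, d in 1..6.
tables = list(product(range(0, 7), range(0, 7), range(1, 7), range(1, 7)))
agree = all(conditional_probability(*t) == cell_ratios(*t) for t in tables)
print(agree)  # True
```

A subject using cell ratios would therefore be indistinguishable from a conditional probability judge on any problem set scored by direction of judgment, which is why only the interview can separate the two.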

Once subjects had stated their strategies, they were shown a model
of each of the four proposed strategies and asked to identify the one which
most closely resembled their problem solving approach. This classification
is referred to as model choice. Frequency of choices of the various models
is shown in Table 5. Responses not represented in the strategy examples
were coded as "other". Of these unclassifiable subjects, six said that they
used more than one rule, and the remaining subjects said that they used some
strategy not listed in the models.

Finally, subjects were asked to indicate the best strategy among the
four examples. This response will be labelled best strategy. Table 6
lists frequencies of subjects' choices of each of the strategies. The
group categorized as "other" includes several subjects who thought that
two or more categories were equally good, some subjects who thought the
cell ratio strategy was best, and some subjects who preferred some strategy
not listed in the examples.

As in the judgment-based strategy classification, a subject's strategy
classification on each of these three measures was converted to a scale
score corresponding to the level of his or her classification in the
strategy hierarchy. Since cell ratio judges should produce the same
judgments as conditional probability rule users, these two rules were grouped
together in these analyses. Subjects who said that they guessed were
given a score of 0. Comparisons between classification methods were made
in terms of these scale scores. The unclassifiable subjects were not
included in these analyses.
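
The scoring scheme can be written as a small lookup table. The sketch below is illustrative only: the label strings and function names are hypothetical, the intermediate scores of 2 and 3 are inferred from counting the criteria passed at each level of the hierarchy (the text states only cell a = 1 and conditional probability = 4), and a subject is assumed to be dropped from a comparison when unclassifiable on either of the two measures being compared.

```python
SCALE_SCORES = {
    "guess": 0,
    "cell a": 1,
    "a versus b": 2,               # inferred: two criteria passed
    "sum of diagonals": 3,         # inferred: three criteria passed
    "conditional probability": 4,
    "cell ratios": 4,              # grouped with conditional probability
}

def scale_score(label):
    """Return the scale score, or None for unclassifiable ("other") labels."""
    return SCALE_SCORES.get(label)

def paired_scores(labels_method_1, labels_method_2):
    """Score pairs for subjects classifiable by both methods (an assumption)."""
    pairs = []
    for first, second in zip(labels_method_1, labels_method_2):
        s1, s2 = scale_score(first), scale_score(second)
        if s1 is not None and s2 is not None:
            pairs.append((s1, s2))
    return pairs

# Hypothetical labels for four subjects under two classification methods.
judgment_based = ["a versus b", "conditional probability", "other", "cell a"]
stated = ["sum of diagonals", "cell ratios", "guess", "other"]
print(paired_scores(judgment_based, stated))  # [(2, 3), (4, 4)]
```

Pairs of this form are what the correlations and t tests between classification methods would be computed on.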

Correlations between the various strategy classifications indicate
some congruence between methods. The correlation between judgment-based
strategy classification and stated strategy is .58 (p < .001). Classification
of subjects by the two methods is illustrated in Table 4. Comparisons
between these classification systems indicate that differences between
classifications by the two methods do not show a reliable direction
(t(94) < 1, ns). A close inspection of Table 4 shows that performance-explanation
congruence differed according to subjects' strategy classification.
Subjects whose performance patterns showed use of a conditional probability
rule were almost uniformly accurate in describing their strategies (97%
of conditional probability subjects). Among the other groups combined
(excluding "other") only 24% of the subjects described rules congruent with
their performance patterns. A comparison of the two groups shows this
difference to be reliable (χ², df = 1, p < .001).

Comparison between judgment-based classification and subjects' model
choice also showed reliable congruence between the two methods (r = .45,
p < .001). Table 5 shows classification of subjects by the two methods.
Comparison between the classification methods shows that model choices were
neither reliably more nor less sophisticated than their judgment-based strategy
classification (t(98) < 1, ns). The correlation between the strategy explanation
and model choice measure indicates some agreement between these two self-report
measures (r = .53, p < .001) with the subject classifications neither
better nor worse by the two methods (t(99) < 1, ns).
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Finally, subjects' selection of best strategy was compared to their 
classifications by other methods. Model choice and best strategy used 
t^e same multiple choice method, and were thus deemed to make the best 
case for comparison (see Table 6 for classification by the two methods). 
Subjects' selections of best strategy were reliably more sophisticated 
than the strategy they identified as their own (tCSS) = j,35» p < ,001), 
suggesting that subjects recognized a better way to solve the problems 
when one was provided. Their choices of best strategy were also more * 
sophisticated than their judgment-abased strategy classifications CtC84) - 7.19, 
p < .COl). 
Discussion 

These results offer considerable evidence on relative congruence
among self-report and performance-based methods of identifying strategies
underlying covariation judgments. All comparisons suggest some agreement
between methods, with correlations ranging from .45 to .58. Correlations
at this level indicate that subjects have some insight into their judgment
bases. However, closer inspection of Tables 4 and 5 indicates that some
subjects show considerably better insight than others. In particular,
conditional probability subjects (judgment-based classification) are
impressively accurate, with 97% describing a conditional probability (or
cell ratio) strategy in their strategy explanation, and 84% selecting that
strategy in the model choice measure. In sharp contrast, all other subject
groups show poor congruence between the performance-based and self-report
measures, with 24% agreement between judgment and explanation measures and
25% agreement between judgment and model choice.

The strength of our judgment-based classification system is our ability
to evaluate whether a stated rule would produce the obtained judgment pattern.
A close inspection of Table 4 illustrates this comparison. For example,
no subject with a cell a judgment pattern described him or herself as
using a cell a judgment rule. Our interpretation of this difference would
be ambiguous if these subjects described rules which would produce a cell
a judgment pattern on the problem set. However, this was not the case.
Half of these subjects said they were guessing, an approach which would
yield a cell a pattern only 11 percent of the time (i.e., the chance
probability of producing the pattern). The remaining subjects with cell
a performance patterns said they were using cell ratios, a strategy which
would result in a conditional probability judgment pattern. Subjects
showing a versus b patterns also showed poor insight into rule use, with
11 of 42 classifiable subjects describing themselves as using rules which
should produce more errors than they actually showed, and [number illegible
in source] subjects describing strategies which should have produced more
accurate records than actually obtained. Most of the subjects whose judgment
performance indicated sum of diagonals strategy use described strategies that
would produce conditional probability judgment patterns. Several subjects
described themselves as comparing ratios of cells (the source shows a and b
as numerators; the denominators are illegible), a strategy which would
mimic a conditional probability strategy on the problem set. However, it
is interesting to note that only one of the subjects who said they were
using cell ratios produced a judgment pattern congruent with their described
rule. As noted earlier, self-report and judgment pattern were congruent for
conditional probability judges. In these cases we are not simply noting
relative agreement between performance and explanation. Our rule diagnostic
problem set also allows us to show whether subjects' self-reported rules
would have produced their actual performance patterns.
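
The 11 percent chance figure for a guessing subject can be approximated with a short calculation. The sketch below is not taken from the original analysis. It assumes three equally likely response alternatives per problem (more likely, less likely, equally likely), independent guesses, three problems per strategy type with a two-of-three passing criterion, and a cell a pattern defined as passing the cell a problems while failing the other three problem types. The function name `prob_pass` is hypothetical.

```python
from fractions import Fraction
from math import comb


def prob_pass(p_correct, n_problems=3, criterion=2):
    """Probability of at least `criterion` correct guesses in `n_problems`."""
    return sum(
        comb(n_problems, k) * p_correct**k * (1 - p_correct) ** (n_problems - k)
        for k in range(criterion, n_problems + 1)
    )


# Assumption: three response alternatives, so a guess is correct 1/3 of the time.
p_correct = Fraction(1, 3)

p_pass = prob_pass(p_correct)   # pass one problem type by guessing
p_fail = 1 - p_pass             # fail one problem type by guessing

# Cell a pattern: pass the cell a problems, fail the three other types.
p_cell_a_pattern = p_pass * p_fail**3

print(p_pass, float(p_cell_a_pattern))
```

Under these assumptions the probability of passing a single problem type by guessing is 7/27, and the probability of the full cell a pattern is about .105, which is in line with the 11 percent figure cited above.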
One possible interpretation of poor agreement between judgment and
explanation might be that subjects shifted rule use at some point in the
problem set. A subject may have judged the initial problems by one
strategy, but changed strategy by the end of the problem set. This
individual's judgments might yield a classification according to the
initial strategy, but he or she would be accurate in describing use of a
different strategy to solve the last problem. In fact, some of our
subjects said that they used more than one rule in response to the model
choice question. This possibility may explain a few judgment-explanation
discrepancies, but our rule classification system makes it unlikely as a
general account. That is, a subject had to accurately judge at least two
of the three problems of each strategy type to have passed criterion on
that type. The problems were blocked such that one problem of each strategy
type appeared in each third of the problem sequence. A subject would have
to shift strategy after the eighth problem of the set to have met the
criteria for his or her initial problem solution strategy in the judgment-based
classification. Shifts at other points should produce judgment records
that do not conform to any of our strategy patterns. These subjects would
be labeled "other" and not be included in our method comparisons. In fact,
such unclassifiable subjects were infrequent in this sample (11.2%).
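
The argument about a shift after the eighth problem can be checked with a small simulation. This is a minimal sketch under stated assumptions, not the scoring procedure actually used: it assumes a strategy is always accurate on the problem types marked "+" in the strategy-by-problem-type accuracy table (B) and always inaccurate on those marked "0", and that each block of four problems presents the types in the same order (the real within-block order is not given). The names `PATTERNS`, `simulate`, and `classify` are hypothetical, and the all-inaccurate row of the table is omitted.

```python
# Accuracy patterns over the four problem types, in the order:
# cell a, a versus b, sum of diagonals, conditional probability.
PATTERNS = {
    "conditional probability": (True, True, True, True),
    "sum of diagonals": (True, True, True, False),
    "a versus b": (True, True, False, False),
    "cell a": (True, False, False, False),
}


def simulate(initial, final, shift_after):
    """Accuracy record for 12 problems: `initial` strategy is used through
    problem number `shift_after`, `final` strategy afterwards."""
    record = []
    for position in range(12):
        strategy = initial if position < shift_after else final
        record.append(PATTERNS[strategy][position % 4])
    return record


def classify(record):
    """Two-of-three criterion per problem type, then match a known pattern."""
    passed = tuple(
        sum(record[block * 4 + ptype] for block in range(3)) >= 2
        for ptype in range(4)
    )
    for strategy, pattern in PATTERNS.items():
        if passed == pattern:
            return strategy
    return "other"


# A subject who shifts only after problem 8 keeps the initial classification.
for initial in PATTERNS:
    for final in PATTERNS:
        print(initial, "->", final, ":", classify(simulate(initial, final, 8)))
```

Because the first two blocks already decide each problem type under the two-of-three criterion, every simulated subject who shifts after the eighth problem is classified by the initial strategy, whatever the final strategy is.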

These results show that agreement between different self-report measures
is limited as well. The correlation between subjects' strategy explanation
and model choice was a modest (though significant) .53. Thus, the issue is
not simply one of the validity of self-report of strategy use. Method of
obtaining that self-report affects subjects' responses as well.

These comparisons suggest that self-report may be a weak data base for
research on covariation judgment. We note, however, that there may be conditions
under which self-reports would be more accurate. Our subjects described
their strategies after solving a series of problems. Ericsson and Simon
(1980) argue that features of memory and attention might predict that
reports would be erroneous under these conditions. In particular, subjects
must retrieve the relevant information from long term memory in order to
explain their judgment rule. Potential sources of error include problems
in storing or retrieving the information from long term memory and incomplete
reporting of the available information. Ericsson and Simon (1980) argue
that such problems are minimized by gathering self-reports through a
think aloud technique in which subjects verbalize their reasoning as
they solve the problem.

Although alternative techniques may improve self-report accuracy,
our method is most relevant for comparison with past research in this area.
In particular, Smedslund (1963) and Adi and colleagues (1978) each asked
subjects to explain their strategies after making several judgments about
event covariations. Our evidence suggests that self-report of less-than-optimal
strategies will be inaccurate under these circumstances.

Considering covariation judgment as a problem in applied mathematics,
our findings also have implications for educational assessment. That is,
self-report may be a poor method for diagnosing the sources of individual
students' errors in applying ratio concepts. Our finding of strategy
classification differences in self-report accuracy is somewhat ironic from
an educational point of view. That is, the students best able to report
their strategies would be those who need help the least. The success of a
program to improve these judgments may well depend on the starting strategy
of the individual involved. Our evidence indicates that student self-report
is unlikely to yield an accurate diagnosis of sources of judgment error.

Our subjects do show some insight into the strengths and weaknesses
of their chosen strategies. First, confidence ratings showed that subjects
were less confident of their accuracy on problems where errors were high
than on problems where error rates were low. Secondly, twice as many
subjects selected the conditional probability rule as the best rule as
were classified as using the rule in problem solutions (65 percent vs.
32 percent). One might wonder why subjects would persist in using a rule
they knew was flawed. However, shifting rules requires that subjects be
able to generate a better rule to use. This evidence indicates that subjects
are better at recognizing good rules than at producing those rules on their
own.

A final consistent finding worth noting is the sex difference in
judgment accuracy and strategy use. This sex difference is not surprising
in the light of much past research showing males better than females in
mathematical reasoning beginning in junior high and continuing throughout
adulthood (Maccoby & Jacklin, 1974). Since the conditional probability
rule builds so directly on comparisons of two ratios, we might expect sex
differences in this judgment as well. Our method offers the additional
advantage of identifying specific strategies employed by subjects of each
sex. Compared to males, females were especially unlikely to use the
conditional probability rule (19.3 percent vs. 46.3 percent), preferring
the simpler and less accurate a versus b rule (41.9 percent vs. 29.6 percent).
This difference could have several possible sources. One likely source
is simply that the two sexes came to the experiment with different training
backgrounds. Other studies have found males and females to be substantially
different in participation in math courses by the time they get to college
(Fennema, 1977; Keeves, 1973; Hall & Shaklee, note 1; National Assessment of
Educational Progress, 1979). Further work would be required to assess the
role of differential math training in sex differences in covariation
judgments.

In overview, our results indicate that subjects' self-reports of
covariation judgment rules show limited congruence with actual judgment
patterns. Self-report was an especially poor method for identifying
sources of inaccuracy in judgment patterns. Such effects of assessment
method offer a ready explanation for poor agreement about strategy use
in past studies of covariation judgment. These results suggest that self-report
measures are weak bases for drawing conclusions about strategy use.
These problems with self-report in covariation judgment accord well with
other research showing poor correspondence between subjects' judgments
and their explanations about those judgments.
We had some difficulty defining a noncontingent relationship for the
sum-of-diagonals problems. The problem we included (middle problem, column
3, Table 1A) deviates slightly from independence ($P(A_1 \mid B_1) - P(A_1 \mid B_2) = -.06$)
by the conditional-probability rule. As a result we scored responses as
correct if subjects concluded that $A_1 \mid B_1$ was either less likely or just as
likely as $A_1 \mid B_2$. The problem does discriminate appropriately between the
other judgment rules. Cell-a and a-versus-b judges should say that $A_1 \mid B_1$
is more likely than $A_1 \mid B_2$; sum-of-diagonal judges should say the two
outcomes are equally likely.
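
To make the disagreement between rules concrete, the sketch below applies simplified versions of the four judgment rules to a single 2 x 2 table. Everything specific here is an assumption for illustration: the cell layout (a = A1 with B1, b = A1 with B2, c = A2 with B1, d = A2 with B2), the way each rule is operationalized, and above all the frequencies, which are hypothetical values chosen to reproduce the qualitative pattern described in this note rather than the problem actually used in the study.

```python
def conditional_probability(a, b, c, d):
    """Difference P(A1|B1) - P(A1|B2) under the assumed cell layout."""
    return a / (a + c) - b / (b + d)


def sum_of_diagonals(a, b, c, d):
    """Compare the confirming diagonal (a + d) with the other (b + c)."""
    return (a + d) - (b + c)


def a_versus_b(a, b, c, d):
    """Compare cell a with cell b only."""
    return a - b


def cell_a(a, b, c, d):
    """Assumed reading: +1 if cell a is the single largest cell,
    -1 if it is the single smallest, 0 otherwise."""
    cells = [a, b, c, d]
    if cells.count(a) == 1 and a == max(cells):
        return 1
    if cells.count(a) == 1 and a == min(cells):
        return -1
    return 0


def verdict(score):
    """Translate a rule's score into a judgment about A1 given B1 vs. B2."""
    if score > 0:
        return "more likely"
    if score < 0:
        return "less likely"
    return "equally likely"


# Hypothetical frequencies, not the study's actual problem.
table = {"a": 10, "b": 8, "c": 5, "d": 3}

for rule in (cell_a, a_versus_b, sum_of_diagonals, conditional_probability):
    print(rule.__name__, verdict(rule(**table)))
```

With these illustrative frequencies the cell a and a versus b rules judge A1 more likely given B1, the sum of diagonals rule judges the outcomes equally likely, and the conditional probability difference is about -.06, a slight departure from independence in the "less likely" direction.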
B) Strategy use and resultant patterns of problem accuracy
(+ = accurate, 0 = inaccurate)

                                  Problem Strategy Type

Subject                                         Sum of       Conditional
Strategy Type          Cell a    a versus b    Diagonals     Probability

Conditional
Probabilities             +          +             +              +

Sum of Diagonals          +          +             +              0

a versus b                +          +             0              0

Cell a                    +          0             0              0

Strategy 0                0          0             0              0




Table 2

Mean Judgment Accuracy Per Problem Type

                                    sum of      conditional     all
            cell a    a versus b   diagonals    probability    types

females      2.81        2.64        1.23          1.00        1.90
males        2.96        2.65        1.72          1.43        2.20
all          2.88        2.65        1.47          1.21        2.05
Table 3

Judgment-Based Strategy Classifications
(percentages)

                                    sum of      conditional
            cell a    a versus b   diagonals    probability    other      N

males         3.7        29.6        11.1          46.3          9.3      54
females       6.4        41.9        19.3          19.3         12.9      62
all           5.2        36.2        15.5          31.9         11.2     116
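
The percentages in Table 3 can be converted back to subject counts as a consistency check. The sketch below is an illustration rather than part of the original analysis; it reads the garbled male sum-of-diagonals entry as 11.1 (an assumption, chosen because it makes the male row sum to 100 percent), and the names `TABLE3` and `to_counts` are hypothetical.

```python
LABELS = ["cell a", "a versus b", "sum of diagonals",
          "conditional probability", "other"]

# (percentages in LABELS order, group size N) as read from Table 3.
TABLE3 = {
    "males": ([3.7, 29.6, 11.1, 46.3, 9.3], 54),   # 11.1 is an assumed reading
    "females": ([6.4, 41.9, 19.3, 19.3, 12.9], 62),
    "all": ([5.2, 36.2, 15.5, 31.9, 11.2], 116),
}


def to_counts(percentages, n):
    """Round each percentage of n to the nearest whole subject."""
    return [round(p * n / 100) for p in percentages]


counts = {group: to_counts(pcts, n) for group, (pcts, n) in TABLE3.items()}

for group, row in counts.items():
    print(group, dict(zip(LABELS, row)), "total:", sum(row))
```

The recovered counts sum to each group's N, and the male and female counts add up to the combined row (6, 42, 18, 37, and 13 subjects, N = 116).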
Table 4

Frequencies of Strategy Classifications by Judgment-Based
and Strategy Explanation Methods

[Cell frequencies are not recoverable from the source. Classification
categories: guess, cell a, a versus b, sum of diagonals, conditional
probability, other, all.]

Table 5

Frequencies of Strategy Classifications by Judgment-Based
and Model Choice Methods

[Cell frequencies are not recoverable from the source. Classification
categories: guess, cell a, a versus b, sum of diagonals, conditional
probability, other, all. Judgment-based totals ("all"): cell a 6,
a versus b 42, sum of diagonals 18, conditional probability 37, other 13;
N = 116.]




Table 6

Strategy